ABSTRACT

THE QUESTION OF VALIDITY AND APPLICATION: HOW DO WE KNOW HOW WELL
WE ARE LISTENING



The validity of the various listening tests has been established through inspection by listening
theorists. This study sought additional support for these claims of validity. One hundred twenty
students enrolled in basic speech courses were asked to complete the Receiver Apprehension Test
(RAT) and take the Watson-Barker Listening Test, Form A. Statistical analysis of the data revealed
a significant correlation between RAT scores and both Long Term Memory and Total Listening, but
not between RAT scores and Short Term Memory. The significant relationships were curvilinear in
nature, as expected, based on the relevant literature. It was concluded that the claims of validity
for the Watson-Barker instrument are partially supported by this data.

The paper concludes with a general discussion of progress in listening research and pedagogical 
advances in the listening field. 



[This material has been presented, by invitation, at SSCA in 1986.]



THE QUESTION OF VALIDITY AND APPLICATION



Listening is the most widely used human means of receiving information. Countless studies
have verified this generalization (Rankin, 1926; Wilt, 1949; Breiter, 1957; and Duker,
1971). However, a concerted research interest in listening is relatively new. Duker (1964)
mentions articles on listening that go back to the early 1900's, but few studies were actually
undertaken prior to the late 1940's. After this date, interest increased dramatically.
Pedagogical consideration of listening intensified during the fifties and early sixties, due in part,
perhaps, to the research published by Nichols (1948) and Brown (1949). Scholars were aided
immensely in these efforts by the availability of an instrument (Brown and Carlsen, 1955) that
allowed for the diagnosing of listening comprehension skills. But almost as quickly as it grew,
the interest in listening research declined.

In the mid-sixties a number of criticisms of listening tests, and indeed of the whole
conceptualization of listening, surfaced (see, for example, Becker, 1963; Petrie, 1964; Kelly,
1967). Perhaps for this reason, though the number of possible outlets for publishing research
reports has increased dramatically since 1960, there has been less published research on
listening in the last ten years than there was in the 1950's. This relative paucity of research is
reflected in basic speech textbooks and, perhaps more critically, in the leading listening
textbooks. Two of the most recent listening textbooks footnote as many studies done prior to
1960 as they do studies done after that date (Steil et al., 1983; Wolvin et al., 1982). Other
scholarly works fare no better. In 1978, ERIC and SCA jointly issued Assessing Functional
Communication (Larson et al., 1978), in which listening assessment was discussed. Only one of
the references cited in the article was written within five years of the publication date of the
article, while nine were written prior to 1960.

Erway (1972), in attempting to explain why less research was being published, suggested
that it was because it "has been difficult to measure valid changes in behavior because we have
not yet decided what listening is" (p. 22). A valid and reliable test had not been agreed upon by a
majority of the listening researchers. We continued to focus on measurement, to the detriment
of theory building. Perhaps it is with our listening research focus as Delia (1976) suggested it
was with that of ethos, in that measurement procedures rather than theoretic explication have
been of more importance and thus have prevented real understanding of the conceptual area. It
would seem more prudent to first discover what it is that we should be studying before deciding
how we should measure it. Definitions are key building blocks for theory. Any definition of
listening accepted by researchers would not only help shape their theories, but also would guide
their investigations and suggest particular methodologies.

The focus on tests and not listening theory perhaps prompted Cronkhite (1974) to suggest
that research be undertaken to investigate "variables that influence the audience's ability to
reliably evaluate messages," and to "turn our existing speaker-oriented research upside down
to discover implications for critical listening" (pp. 81-82). Sprague (1974), too, called for
the translation of "speaker-oriented, control-oriented theories and research findings into
receiver-centered, choice expanding implications" (p. 83).

While the plea of such scholars for the creation of listening theories was persuasive, the
successful translation of sender theories into receiver theories has not yet happened. It may be
that it never will. Crucial to its success is the implied linkage between encoding and decoding. If
we are to flip these theories over so that they address themselves to effective listening rather
than effective speaking, should we not first ascertain if there is such a connection? So far such
a connection has been suggested by many, accepted axiomatically by some, and substantiated by
no published research that this writer has discovered. The fulcrum that would allow us this
Atlas-like task remains elusive. We have yet to ascertain if listening and speaking mirror each
other or "shadow" each other. Does one process reverse its opposite, repeat it, or are they
totally different from each other? If they are reversible, then we can indeed "turn our...
research upside down." But if the latter is the case, data derived theories of speaking "only"
need be generalized "right-side up" to listening.

While there are indications that the decline in scholarly attention to listening is ending, it
does not appear that our focus has wavered from the methodological question of how to best
measure listening. The bulk of the current academic research effort seems to be concentrated on
measurement rather than on the explication of listening theory. A new organization, the
International Listening Association, was formed in the early 1980's to foster support for
listening research and education. The business community has increased its emphasis on
listening training. However, we still lack a conceptual delimitation of the concept that would be
acceptable to a majority of the listening researchers within our discipline.

Differences in conceptualization and operationalization of variables abound in the literature.
Such diversity is neither a surprise nor a "curse." It allows for the emergence of the most
robust theoretical explanation. Such definitional "battles" currently are being waged between
various contending listening tests. The number of such listening assessment devices available to
the modern communication researcher and/or teacher is increasing. Just as the Brown-Carlsen
Listening Comprehension Test seemed to spur research activity during the ten years or so after
its inception, various listening tests, especially the Kentucky Comprehensive Listening Test and
the Watson-Barker Listening Test, seem to be prompting an increasing number of listening
studies. But just as critics of the Brown-Carlsen Listening Comprehension Test questioned its
validity, so too are there questions concerning the validity of these current measurement
devices.

The Watson-Barker Listening Test was developed in 1982 in an attempt to create a
standardized listening test that would be oriented primarily toward adults and mature college
level audiences (Watson and Barker, 1984). The Kentucky Comprehensive Listening Test, also
created in the early 1980's, seems similarly oriented. A number of reliability analyses have
been conducted and acceptable levels of reliability established for both instruments. However,
the only measure of validity undertaken for the Watson-Barker Listening Test was that of "face
validity" (Watson and Barker, 1984, p. 1). Given the diverse definitions of "listening" held by
various listening experts, such support is not totally reassuring. Other efforts at establishing
validity are being undertaken. Experiments are being conducted in an attempt to link test
results of the Watson-Barker instrument with those of other listening tests such as the
Kentucky Comprehensive Listening Test. While such experiments will help to establish the
efficacy of comparing data of the various tests, they provide only a tautological validation of the
instruments. If all tests are highly correlated and if any one test is valid, then the validity
claims of all tests can be accepted. If no check of validity other than that of "face validity" is
performed, all such claims should be held in abeyance until the concept of "listening" is agreed
upon substantively by listening theorists.

Bostrom (1984) suggests that there are several ways of establishing validity and that the
"usual definition (measures what it is 'supposed' to measure) does not exactly fit the kind of test
that the KCLT represents" (p. 2). He states that this is so because "each of the scales represents
and [sic] actual instance of the performance of the skill in question" (p. 2). Bostrom seems to be
avoiding the definitional battle by begging the question: he assumes that, while the definition of
listening has not been agreed upon, the various subskills that his test measures have been accepted. If the
"whole" has not been agreed upon, then the "parts" that make up the totality of that "whole" are
no surer. While all agree, either vigorously or at least tacitly, that certain subskills such as
"retention" do belong within the province of listening, others are argued about vehemently.
There is quite a bit of disagreement concerning which various subprocesses should be included
within the conceptualization of listening. Is listening a combination of "hearing, understanding,
and retaining" information, or should other subprocesses be included or some of these be
excluded (Bostrom, 1984)?

The problems of establishing the validity of listening tests are monumental. Bostrom does
discuss other methods of establishing validity. One procedure is to illustrate that the
instrument in question measures a unique characteristic. He compares a wide variety of tests
with his Kentucky Comprehensive Listening Test to illustrate its uniqueness. Brown (1985)
reports similar tests that suggest that the Brown-Carlsen Listening Comprehension Test
measures something different from reading comprehension, intelligence, and scholastic
achievement. While this data is compelling evidence that these instruments measure unique
constructs, it does not support the contention that they measure "listening ability." To say that
something is not several other things is not the same as saying what it is.

Bostrom (1984) continues his quest for validity by illustrating that certain groups score
differently than others on the test. Specifically, he indicates that college students, army
officers, and high school students have different performance levels. Bostrom suggests that "the
KCLT does exactly what we might predict, showing different performance levels for each of these
groups" (p. 2). Knowing several members of each subject set, I suggest that none of the sets can
boast of a uniform level of listening ability. Further, high school students outperformed the
other two groups in short-term listening and selective listening, while army officers scored
better than both groups on lectures. I can find little theoretic or common sense backing for
predictions in those directions. This is not to say that his instrument does not measure listening
ability. Rather, it is to suggest that he has not substantiated his case for the validity of his
instrument using this criterion.

Regardless of the various conceptualizations of listening, it appears clear from the nature
of the instruments being used to measure "listening ability" that the one subprocess that is
central to the measurement of listening is the "recall" or "recognition" of retained information.
All tests share a common method. Subjects are asked to listen to a message, or set of stimuli, and
then are asked to recall or recognize various parts of that message or set of stimuli either
immediately after hearing the test passages or at some delayed time thereafter. While the
nature of the test passage varies from instrument to instrument, this procedure seems
invariant.

Another constant appears to be the effort on the part of the designers to hold "listening
motivation" constant for all subjects. All of the major tests of listening ability are administered
in such a manner that all subjects are aware that their listening is to be tested. Kelly
(1967) points out the problems of external validity using this procedure when he notes,

We have a massive body of information about the listening behavior of subjects 
who knew they were going to be tested. . . but we have done almost nothing to find
out about performances across the general range of situations from panic to 
boredom (p.464). 

This is crucial to the external validity of listening tests when one considers that one of the
most consistent findings in listening research has been that the recall of material is facilitated by
increases in extrinsic motivational cues. Forewarning of a test has been found to be such a cue.
Knowledge that a test will follow a listening experience has been labeled "anticipatory set."
Anticipatory set creates the real possibility that a "ceiling effect" may be established. Procedures
that are common in listening measurement severely limit the free functioning of any antecedent
listening ability, as would be manifested in a "non-laboratory" situation. This phenomenon has
been reported by many researchers (see, for example, Anastasis, 1961; and Kelly, 1962, 1965,
1967). Cronen and Mihevc (1972) discuss how subjects under "aware" conditions actively listen
to messages so that they might answer questions concerning the material at a later time. The effect
of forewarning is to raise the motivational forces naturally at work in the typical listener as high
as his mental ability will allow and to disallow the differential functioning of other pertinent
variables upon the comprehension and retention of material (Kelly, 1967). This may well be the
reason that correlations between measures of mental ability and intelligence, and such listening
tests as the Brown-Carlsen Listening Comprehension Test and the STEP, have been so high (Keller,
1960; Petrie, 1961; Andersen and Baldauf, 1963).

Listening test designers should not be uninterested in studying the listening behavior of
subjects under these conditions. Many classroom teachers hope that these conditions exist for
them in their various courses. However, even a cursory inspection of the most ideal classroom
will reveal that students are not motivated to listen, day in and day out, to the information
presented them. Many students seem to be content to remember information only so long as it
takes to place that information in their notes. In any case, conditions where testing is imminent
are not likely to be found in most other situations.

Of particular interest then is the extent to which scores obtained in controlled conditions of
standardized motivation reflect the listening ability of subjects when they venture outside the
laboratory environment. Resolving this question of external validity is not an easy task, given the
nature of the listening instruments extant today. While the Watson-Barker Listening Test does
contain stimuli that are capable of being generated in a non-laboratory setting, the task of getting
even one subject to respond to questions that would mirror the content of the test under conditions
of "nonawareness of the intent to test" is too huge to seriously consider. The Kentucky
Comprehensive Listening Test contains many items that would not be found outside of the
laboratory (though the distracting stimuli contained in one part of the test could well be).

Although the dangers of the testing situation are obvious, research scholars are caught in a
dilemma: if they warn subjects they are to be tested, the subjects are motivated (thus
"standardizing" the test conditions and making inoperative many factors that would normally affect
comprehension) and the test becomes artificial; yet, if subjects are not warned, reliability
suffers and it cannot be considered a fair test (Kelly, 1971, p. 216).

At least one other method for severing the largely tautological Gordian Knot of validity claims
was suggested by the efforts of Bostrom (1984). While uniqueness is one characteristic of
validity, shared commonality, as evidenced by significant correlations with valid measures of a
phenomenon, is acceptable support of a contention of validity. There are tests of established
validity that are conceptualized to measure certain aspects of the listening domain. One such
instrument is the Receiver Apprehension Test (Wheeless, 1975). This instrument measures the
self-reported anxiety of subjects that is associated with listening to stimuli generated in a variety
of situations. It has been studied in terms of its relationship to other self-report measures
(Beatty, 1981; Beatty and Payne, 1981) and its psychometric properties (Beatty, in press). Of
particular note is the established correlation of RAT scores and physiological arousal (Roberts,
1980, 1984). This becomes even more important when the correlation between arousal and
retention is entered into the equation. A number of researchers have established a link between
retention and arousal (Kleinsmith and Kaplan, 1963; Crane et al., 1971; Roberts, 1980). The
relationship between arousal and retention is posited to be curvilinear in nature, while the
relationship between physiological arousal and RAT scores is linear. Since listening ability is said
to reflect short term retention and long term retention ability, in part, then there should be a
correlation between RAT scores and scores on valid listening tests. This relationship would be
curvilinear in nature. Too much or too little physiological arousal, as indicated by RAT scores,
would result in poorer retention scores, as reflected by scores on a listening test. Optimum levels
of arousal would result in higher retention scores.

In order to test the validity of the Watson-Barker Listening Test, the following hypothesis was
conceived:

There is a curvilinear relationship between receiver apprehension, as measured by the
RAT, and listening ability, as measured by the Watson-Barker Listening Test.

METHOD 

SUBJECTS: Subjects were 127 volunteer undergraduate students, 42 males and 85 females,
enrolled in beginning speech communication courses at a four-year university during the Spring
semester of 1985. Data from seven of the subjects were subsequently discarded for two reasons.
Three of the subjects were from other countries and their grasp of the English language prohibited
an accurate test of their listening ability. Four other subjects did not complete one or both of the
instruments utilized in this experiment.

PROCEDURE: At the beginning of the Spring semester, students in six sections of a basic speech
communications class were asked to volunteer for an experiment. The purpose of the experiment
was explained to them in detail and the procedures that would be followed were outlined. They
were assured that the tests would have no impact on their grade, nor would their decision to
participate or not participate affect their standing in the class. With only one exception, all
students agreed to participate. The one non-volunteer was excused from the next class meeting.
At the next class meeting the subjects were asked to complete the Receiver Apprehension Test
(Wheeless, 1975). After collecting the RAT, subjects were asked to complete the Watson-Barker
Listening Test, Form A (Watson and Barker, 1984). This test requires students to listen to a
twenty minute audio tape and answer questions based on the information presented on the tape.
There are five different types of listening tasks asked of the subjects. Each section of the test is
composed of ten questions. Three of the sections are said to test "short term memory skills" and
the remaining two sections are purported to assess "long term memory skills" (Watson and
Barker, 1984). The test tape begins with a short passage that allows the experimenter to ensure
that all subjects can hear the tape adequately. After the volume control of the tape player was
adjusted, the tape was played for the subjects, with only brief pauses to allow subjects to turn the
pages of their test booklets when required. Although these pauses were not called for in the
instructions provided with the test, they were deemed necessary because of the potential for
distortion that the extraneous noise presented. The actual test time required varied slightly from
class to class (the average time required for completing the Watson-Barker Listening Test was
approximately 30 minutes). After the subjects had completed the test, their answer sheets were
collected, they were asked to refrain from discussing the tests with others who might subsequently
participate in the experiment, and they were assured that their test answers would be evaluated,
scored and explained to them at the next regular meeting of the class.

RESULTS 

The completed tests were scored according to directions provided by the designers of the two 
instruments. As indicated above, four of the subjects failed to complete one or both of the tests and 
the tests of three othersubjects were discarded because it was evident that they ffld not unde^ 
English well enough to have their listening ability effectively measured by the Watson-Barker 
instrument Pearson product-moment correlations were obtained for the scores of the remaining 
1 20 subjects on the RAT and the Wstscn-Barker test measures of short term memory, long term 
memory, and total listening ability (short term memory plus long term memory). As suggested 
by the literature concerning the nature of the relaticnshi p between arousal , as tapped by the RAT 
instrument, and the retenticr. "...tension measured by listening tests, no significant relationships 
were established for total listening ability , short term listening, or long term listening 
(respectively the results were r=. 12, r= 13,r=.06;p>.05). 
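
The linear check reported above can be illustrated with a short sketch. It is not the analysis run in this study: the function name `pearson_r` and the small synthetic data set are assumptions introduced here for illustration, and only the Python standard library is used. The sketch shows why a Pearson coefficient near zero is what one would expect when the underlying relationship is an inverted U.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with n >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Synthetic, hypothetical scores (NOT the study's data): apprehension runs
# from 20 to 60 and "retention" peaks at the middle value of 40.
apprehension = list(range(20, 61, 5))
retention = [30 - 0.05 * (a - 40) ** 2 for a in apprehension]

# A symmetric inverted U has essentially no linear trend.
linear_r = pearson_r(apprehension, retention)
print(f"linear r = {linear_r:.3f}")
```

With a perfectly symmetric inverted U the linear coefficient is essentially zero, even though retention depends entirely on apprehension in the synthetic data. Real scores would be noisier and less symmetric, so small nonzero coefficients of the kind reported above are unsurprising.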

While a certain level of arousal is necessary to perform cognitive tasks successfully,
arousal levels beyond the optimum "readiness" level are dysfunctional (Cofer and Appley,
1964). As indicated above, previous research has shown that there is a significant linear
correlation between RAT scores and physiological arousal. A direct relationship between
memory and physiological arousal has been established as well. This relationship has been
shown to be curvilinear in nature, in line with the "Activation Hypothesis" of Cofer and Appley.
Since the Watson-Barker instrument does claim to measure retention, the relationship between
it and the RAT most probably would not be linear in nature, but rather would be curvilinear in
nature. The further the RAT scores are from the mean RAT score, the lower the Watson-Barker
scores should be.

To test this proposed "inverted U-shaped" relationship, the 120 scores were arrayed on a
scatter diagram and visually examined. This analysis strongly suggested that the relationship
was not linear in nature. To statistically test this relationship, the RAT scores of the 120
subjects were converted to absolute deviation scores from the mean of the population (mean = 40.89) and
Pearson product-moment correlations were obtained for the adjusted RAT scores and the
Watson-Barker scores of short term memory, long term memory, and total listening ability
(Rosenthal and Rosnow, 1984, pp. 222-224). Significant relationships were found to exist
between the adjusted RAT scores and long term memory (r = -.20, p < .03) and between the
adjusted RAT scores and total listening ability (r = -.21, p < .02), but not between the adjusted
RAT scores and short term memory (r = -.12, p < .18). The power of the correlation test was .71
(Cohen, 1977).
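
The adjustment described above, folding each RAT score about the mean and then correlating the folded scores with the listening scores, can be sketched as follows. This is a minimal illustration under stated assumptions, not the computation actually run for this study: the helper names (`fold_about_mean`, `t_for_r`), the synthetic data, and the use of the standard t transformation of r with n - 2 degrees of freedom are all introduced here for illustration.

```python
import math
from statistics import mean

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fold_about_mean(scores):
    """Absolute deviation of each score from the sample mean."""
    m = mean(scores)
    return [abs(s - m) for s in scores]

def t_for_r(r, n):
    """t statistic (n - 2 degrees of freedom) for a Pearson r from n pairs."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Synthetic, hypothetical scores (NOT the study's data): an inverted U
# with its peak at the mean apprehension score of 40.
apprehension = list(range(20, 61, 5))
retention = [30 - 0.05 * (a - 40) ** 2 for a in apprehension]

# Folding turns the inverted U into a monotone (here, negative) trend.
folded = fold_about_mean(apprehension)
folded_r = pearson_r(folded, retention)

# t values implied by the coefficients reported in the text, with n = 120.
t_long_term = t_for_r(-0.20, 120)
t_total = t_for_r(-0.21, 120)
t_short_term = t_for_r(-0.12, 120)
```

In the synthetic data the folded correlation is strongly negative, which is the direction of the coefficients reported above. With n = 120, the reported coefficients correspond to t values of roughly -2.22 (long term memory), -2.33 (total listening), and -1.31 (short term memory). The two-tailed .05 critical value for 118 degrees of freedom is approximately 1.98, so the first two exceed it and the third does not, in agreement with the pattern of significance in the text.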

DISCUSSION

The hypothesis was supported with regard to the relationships among the RAT scores and
both long term memory and total listening ability, but not between short term memory and RAT
scores. Previous researchers have suggested a strong link between arousal and long term
retention, and a relatively weaker link between arousal and short term retention (Levonian,
1967; Roberts, 1980). These findings are in line with those results. Taken together with the
previous literature on the arousal-retention relationship, this study provides evidence for the
validity claims of the Watson-Barker Listening Test.

Establishing the validity of any new instrument is difficult. Given the relatively small
portion of variance of listening scores that is accounted for by the RAT measure, definitive
conclusions concerning the validity of this new instrument must wait for additional data
collection. Although the amount of variance accounted for is small, its magnitude is in line with
Barker's (1984) conceptualization of listening, which posits at least six different subprocesses
as being involved with the listening process.
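
The "portion of variance accounted for" can be made concrete by squaring each reported coefficient. The sketch below is illustrative arithmetic only: the helper name `shared_variance` is hypothetical, and the r values are those reported in the Results section.

```python
def shared_variance(r):
    """Coefficient of determination: proportion of variance shared, r squared."""
    return r * r

# Correlations between adjusted RAT scores and Watson-Barker scores,
# as reported in the Results section.
reported = {
    "long term memory": -0.20,
    "total listening ability": -0.21,
    "short term memory": -0.12,
}

for label, r in reported.items():
    print(f"{label}: r = {r:+.2f}, r^2 = {shared_variance(r):.3f}")
```

On these figures the adjusted RAT scores share roughly 4 percent of the variance in the long term memory and total listening scores, and under 2 percent of the variance in the short term memory scores.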

"Recall" is only one of these six processes and the only one to which the RAT has been 
empirically linked. It may well be that recall is of less importance than "attention," "hearing,"
"understanding," or any of the other possible subprocesses of listening, insofar as total listening
scores are concerned.

However, this study does add weight to the claims of external validity for the
Watson-Barker instrument. Further testing of the relationship between this listening test and
measures of "attention," "understanding," etc., would help to increase confidence in this
procedure. A more direct test of the relationship between listening scores on the
Watson-Barker test and physiological arousal seems called for as well.

One additional note of caution is called for, based on the research project outlined above.
While many claims of "face validity" have been made by the designers of listening tests, most of
these tests seem, on the surface, to fail that test of validity because of the single medium nature
of the test stimulus. Listeners generally do not "listen" with just their ears. Listening typically
takes place while the listener is hearing and viewing the sender of the message. While
attempting to assess the listener's ability to analyze the paralanguage message as well as the
verbal message is indeed a useful pursuit, neglecting to measure the listener's ability to gain
knowledge from the other aspects of nonverbal message transmission may render the total
testing procedure useless in terms of applying the results to everyday encounters. Efforts are
being undertaken to develop a listening test that more accurately measures the full range of
decoding activities that the typical "listening" task involves. This new measurement procedure
would include both the aural and the visual stimuli that are present in most communication
situations. It is hoped that this new version of the Watson-Barker Listening Test will be found
to be an even more valid and reliable measure of that nebulous concept we call listening.

Whether that research task should be undertaken before the definition battle outlined above is
resolved is a moot point. The simple fact is that it will be done. The interest is there, and the
need for such a tool is evident. Without it, we cannot hope to develop effective methods for
listening instruction. Given the rather sketchy evidence available, it is difficult to argue
with Erway's (1972) contention that gains from listening instruction are not maintained over
time. Most listening research studies are "quick and dirty." Few longitudinal studies have been
done. The generalizability of most studies is severely limited by the nature of the subject
population drawn upon. By far the most prevalent educational level in listening research is the
elementary school level. Fewer studies have been carried out at the secondary level, and only a
handful have been completed using college-age subjects. This inverse relationship between the
number of studies and the age of subjects seems to mirror the relationship between age and
potential for listening improvement that some researchers have alluded to in their articles
(Evans, 1960; Evertts, 1962; Lieb, 1965).

A close reading of most listening texts reveals that there is little reason to support the
contention that we currently are effectively teaching listening. For such support we continue to
have to fall back upon the subjective judgments of other teachers of listening. Erway (1972)
has suggested, "the most impressive evidence comes not from research but from the prejudiced
reports of students who have experienced instruction and from the observation of instructors"
(p. 23). This "evidence" must be considered especially suspect in light of the finding that people
tend to think more highly of themselves as listeners than test scores indicate and that they are
less able to discriminate between good and poor listening than they are between good and poor
speaking (Stark, 1956). While it can be argued persuasively that we should teach listening at
all educational levels, the only well documented listening finding is that listening is not being
taught in most academic institutions.

Implementing longitudinal investigations that would document effective methods for teaching
listening would help to reverse this tendency towards lip service. If scarcity does increase the
value of a commodity, the results of such studies done in the classroom situation would prove
very worthwhile. Prior to 1970, only fifteen empirical studies investigated pedagogical
phenomena by first teaching teachers to behave in some particular way, then observing them to
make sure they did behave in that way, and, finally, testing their students to note changes
(Sprague, 1974). As noted previously, there are pronounced problems in generalizing
laboratory research to the classroom and beyond. What is lost in terms of ability to control and
limit experimental artifacts would be made up for in terms of the vigor and power of the
generalizability of the resultant data.

Until such experiments are conducted, teachers interested in increasing listening skills can
do no better than rely on the unsubstantiated platitudes that currently make up the bulk of our
listening instruction. We will continue to tell our students to "Withhold evaluation of the
message until the speaker is finished" (Barker, 1984, p. 55) and hope they don't ask us too
many questions about the research that indicates that that is appropriate behavior. There is no
research documentation that would support such imperatives. One even could argue that such a
course of action is inefficient since it causes you to listen to unimportant as well as senseless
drivel. Further, even if that inefficiency were shown to be necessary and/or useful, no
pedagogical direction is available that would allow a teacher to help students carry out that
directive. How does one "withhold evaluation" on the attitudinal level? Does the evaluation only
matter if done on the "conscious" level? Does it matter if people do evaluate a speaker, if they
still continue to listen to him?

The order in which research questions should be tackled is dictated, to a certain degree, by
the urgency of the situation. First we need to develop measures that are valid measures of listening,
regardless of where and under what circumstances that activity takes place. Perhaps several
instruments will be needed to cover all of the important contexts we wish to tap into. Then
expediency necessitates that we undertake investigations to ascertain how we can best facilitate
more effective listening. It may well be that our listening texts have more substance than
alluded to above. If research reveals that there indeed are founts of knowledge and potent
developers of skills already extant, more weight can be applied in the effort to wedge listening
instruction into our already crowded curricula. If none of our current teaching imperatives are
supported, future research directions will be more clear and the weight of unsubstantiated
dogma will no longer have to be borne by listening instructors. Whichever the case, we need to
go forward.

As long as we lack such research we shall be bound to myths
and superstitions which are interesting subject matter for
our methods courses, but which have little relevance for the
real world (Sprague, 1974).